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GCCTCTCTAG GCCGGCGGGT CCTCCQCTCC ATGGTCCTGT CTGTCAGCGr 
61 TGTGTCAGGA GGCCAGTGCC GAGGTCCGGT CGCGCTCCGA CGCtJcSS CTCgSSSS? 
121 TCGCGGGTAT CCCGGCGGCC GCG6GACGAT GGCGTGGTGG cSSSgS SSSgSS? 
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661 AATGTCTAGA GTTGGGCCCT TTGGACACTC TGCTGGCACT TGGGCCCCAT SccttSpp 
111; TCCAGATGGG 6TCTGGCCCA AGTCTGAgS SaAcSSSgA SSSJS^CT 

781 CTGTTGGTGG AGAGATAATG AGGTCCCATC ATAAAGGCAG GTAgSgCCA TGAtJI???! 
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^^A^ TCTCTGGAAG TGAGCAGGCA GCCAGCTTCT ACTGGACCTC AACTGTgSS GqS??SS^ 
^tt"; InJS,^^?:'^'''^ TTTGAGTTCT GTGATAGGGA GGGTGTACTA AaStgSS 

TCTTCCAAGT GGTTTCCTCA GGAAGGGCTG GCAGCTOTCC TTCctSgtS 
1561 CATAAATACA CTATTTTCCA ATC «wt«v.i\aj.v,v, lAt-t-rAQGTA 
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TCTAGGCCGGCAGCGCCTCTCCTCCATGGTCCTGTCTGTCAGCGCTGTTTTGGGAGCCCGCCGGTGAGGC 
CGGGCCACGCTCAGACACTTCGATCGTCGAGTCTGTCACTGGGCATGGCGGGTCAGTTCCGCAGCTACGT 
GTGGGACCCGCTGCTGATCCTGTCGCAGATCGTCCTCATGCAGACCGTGTATTACGGCTCGCTGGGCCTG 
TGGCTGGCGCTGGTGGACGGGCTAGTGCGACAGCCCCTCGCTGGACCAGATGTTCGACGCCGAGATCCTG 
GGCTTTTCCACCCCTCCAGGCCGGCTCTCCATGATGTCCTTCATCCTCAACGCCCTCACCTGTGCCCTGG 
GCTTGCTGTACTTCATCCGGCGAGGAAAGCAGTGTCTGGATTTCACTGTCACTGTCCATTTCTTTCACCT 
CCTGGGCTGCTGGTTCTACAGCTCCCGTTTCCCCTCGGCGCTGACCTGGTGGCTGGTCCAAGCCGTGTGC 
ATTGCACTCATGGCTGTCATCGGGGAGTACCTGTGCATGCGGACGGAGCTCAAGGAGATACCCCTCAACT 
CAGCCCCTAAATCCAATGTCTAGAATCAGGCCCTTTGGACATCCTGCTGACACTTGGGCCCCTTAACACC 
TTGGGCTGCTCAGACCCTCCAGATGAGGTCCAGCCCAGATCTGAGAGGAACCCTG6AAATGTGAAGTCTC 
TGTTGGTTTGGGAGAGATAGTGAGGGCCTGTCAAAGAAGGCAGGTAGCAGTCAGCATGACAGCTGCAAGA 
ATQACCTCTGTCTGTTGAAGCCTTGGTATCTGAQAGGTCAGGAAGGGGACCTCTTTGAGGGTAATAACAG 
AATTGGAACCATGCCACTCTTGAGCCy^CAATACCTGTCACCAGCCTGTTGTTTTAAGAGAGAAAAAAAAT 
CTUIGGATATCTGATTGGAGCAAACCACTTCTTTAGTCATCTGTCTTACCCCCCTGGGACAGCTGTTACCT 
TTGCAGTGTTGCCGAATCACAGCAGTTACCTTTGCAGTGTTGCCGAATCACAGCAGTTCTGTTGGAGAAA 
CGCTTGGTTTCCGGATCCAGAGCCACAGAAAGAAATGTAGGTGTGAAGTATTAGGCTGCTGTCA6GGAGA 
GGATGGCAGATGGAGGCATCAAGCACAAGGAAAATGCACAACCTGTGCCCTGTTATACACACGTTCATGT 
GCACCCAAGAACCTATGACTTTCTTCCAGTTCCTTCTACCAGGTCCCCATCCTGCTGCCAGCTCTCAACA 
TAGCAGGCCATAGGACCCAGAGAAGAATCCCAGCGTTGCTCAAAGTCTAACCATCATAAAGACACTGCCT 
GTCTTCTAGGAATGACCAGGCACCCAGCTCCCACTGGACTCCAATTTTTTTTCCTGCCTTATTTAGAATT 
CTTTGGCGGGAAGGGTATGATGGGTTCCCAGAGACAAGAAGCCCAACCTTCTGGCCTGGGCTGTGCTGAT 
AGTGCTGAGGGAGATAGGAATTTGCTGCTAAGATTTTTCTTTGGGGTGGAGTTTCCTCTGTGAGGGGCTT 
GCaVGCTATCCTTCCTGTGTATACAAATACAGTATTTTCCATGGTTCTGCCTGCACTTACTTTGTAATGCC 
ACGGTTGAGATTGAGAGAGATCAGCGCAGCCAGGCAAGGGAACTTTAAAGAATTATTAGGCCACCTTCTC 
CCTTTCCTGGACCCCAGAGTCATTCCTCCATTTGGTTAAAATACTCAGTGCAGGGAACTCTTACATCCTG 
TCTCCTTCACTTGCAGCGTCCCCTGCTATGCCTCAGGTGAACCACATAATTCTTGGGTTTCCGTTCCTAC 
TTGCTAGTGATTTCTGAACATGTTCAATGGAGCGGCACACAGTCTAGACCCACTTCCGCATTGAAACCTT 
CACTGTTCCTCTTTGGTTTCTTCAGAGCTTTCCCAAGAGAGCTGTCAGTTTTCAGCTGTCAGTAACACAA 
ATGAGTTTATGGTAACACAAATGAGTTTTGCTATCTCTCTGAGAAGCTCATCTGACCTCCTGACTCTCAG 
CCCTACAGAGTAGGGAGTTGATGCTGACAGGATGAAGATTTAGGAATAAATATGCCTGGGAAGAGACTGG 
GAAGGTTCTAGGQTGAGGCACCTCAGTAACTCATGGTACCTTGGCCAAGTTGGAAGGAAGCAGTTTGTTA 
ATGAGGCACAGTAATCCTGGCTGCAGGGTCTAGGAGGTAAGACCAGCTGGGATGACCTTCCCTGGGTTAA 
Ta?UVTTTCCCTCTAGACAACAC?U^CTGCAGGC^ 

GCTGTCACCCTTGACCAGCCGTGGTGGTGGTTACTCCATCTGTGGTTGGAGCGCCTCTTTGGGATTCACT 

TCAAGGTCTTGTGCCTATTTTTCTGCATATCTTCTGTGATGACAAATCTCTGTCCCCTGAGTGTTAATTT 

GATTTTTAGAAATGGCCAAAAGTCACGTGATCCAAACTTTTTTTCAGTAATATGGAGACTGAGCTGCATG 

GTAGTTGGGGATCAAAAATATGTGACCTTAATGAGATTTTTATGATTTCTAAAGTAACAATAAAAGCAGT 

TTTTAGAGTTGAGTTCCAGAGAGGGCAGGGCAATGGCAGTGACATGTTTGTCATTTTAATAATAAATAAC 
ATCTATTGAGTGCTTAA 
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ATGGCGGGTCAGTTCCGCAGCTACGTGTGGGACCCGCTGCTGATCCTGTCGCAGATCGTCCTCATGCAGA 
CCGTGTATTACGGCTCGCTGGGCCTGTGGCTGGCGCTGGTGGACGGGCTAGTGCGACAGCCCCTCGCTGG 
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GLWLALVDAL VRKPVPGPDV RRGDPGLLHP 
PGFHCHCAFL SPPGLLALQL PFPLGADLVA 
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MAGQFRSYVW DPLLILSQIV LMQTVYYGSL GLWLALVDAL VRSSPSLDQM FDAEILGFST 
PPGRLSMMSP VLNALTCALG LLYFIRRGKQ CLDFTVTVHF FHLLGCWLYS SRFPSALTWW 
LVQAVCIALM AVIGEYLCMR TELKEVPLSS APKSNV 



Figure 30D
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PFPGSRGPQL FGIiSRPAGPP LHGPVCQRCV 
VALTGAGGCR APRAGMAGQP RSYVWDPLLI 
PGPDVRRGDP GLLHPSRPAL NDVLRPQRPH 
IiAIjQLPFPLG ADLVAGPGCV HCTHGRHRGV 



RRPVPRSGRA PTIiRPSSRSR VSKEtPRDDQV 
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Figure 30E 



MAGQFRSYVW DPLLILSQIV LMQTVYYGSli GLWIiALVDAL VRKPVPGPDV RRGDPGLLHP 
SRPALNDVLR PQRPHLCPGIi AVUJPAREAV PGPHCHCAFL SPPGLIALQL PFPLGADLVA 
GPGCVHCTHG RHRGVPVHAD GAQGDPPQtiS P 



Figure 30F 



MAGQFRSYVW DPLDILSQXV LMQTVYYGSIi 
SRPALHDVLH PQRPHLCPGL AVLHPARKAV 
GPSRVHCTHG CHRGVPVHAD GAQGDTPQLS 



GLWIiALVDGIi VRQPLAGDPV RRRDPGLFHP 
SGPHCHCPPL SPPGLIiVLQL PFPLGADLVA 
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GTCCGGAGGA 
AACGTGACAC 
TTTGAGGACT 
TGGACAGAAA 
ACCCCTGGTG 
AGCAAGCGGA 
AAGGGAGCA6 
TGTGGCAACA 
GAGGAGCAGT 
CGCTCGCTGA 
CTGACTGGTG 
AACGCTGGCA 
GCCAAGAACG 
AACCGCGGGG 
GTCATCTCTG 
CTGGCCTGCC 
ATTGTTCGGC 
AATGACGTCA 
AAGCAGGCCT 
CTCATGGTGC 
CACAGGAGCC 



GGGCGCGCGC 
GGCCGCGCGG 
CCCGGCCCGC 
GGAGCGCAGG 
GCCACAAGAA 
GCGGTGGAGG 
AGCGGTACCC 
GGGTGTTGTT 
CGCAGTTCGT 
TCGTGCTGGC 
ACAAGGAGAT 
AGAGTTCAAA 
CTGACATGAT 
AGCTGGATGG 
CCACGGCTGC 
ACATTCACAA 
GTCTGAGCAT 
GCGTTGTTCT 
GTAAGATTGG 
TGGTGGTGGT 
AGATCATCCG 
TGGAGATGGG 
CCGTGGTTCG 
ACAAGACAGG 
TGGCCTACGG 
AGCAATCCCA 
CCATGAGCAG 
CCGTGTACGA 
CCTGCCGAGT 
GTGTGGGACT 
ACCAGGTCCT 
TGGGCATCAT 
ACGTCGTCAT 
TGGCCCGGGA 
ACCAACACTT 
AGGTGGCCAC 
TGGAGGACCA 
TCAAGGTTTG 
CACATCTGGT 
AGGCCCACCT 
GAGACTCCCT 
AGTGCCCGGC 
TGCTCCAAGA 
GCATGATCCA 
CGCTGGCAGC 
ACGGTCGGAA 
TCTGCATCAG 



GTGGGCGCAG 
GCGGCTCCGC 
CCGCTGCCGG 
GGGCGCGGCG 
GCGGGTGGAC 
GGAGCGCAGG 
TCGAAATGTC 
CAGCCAGTTC 
CCCAGAGATG 
TGTCACCATC 
GAACTCCCAG 
CATCCAGGTG 
CTTCCTGAGG 
AGAGACAGAC 
TGACCTCCTG 
CTTCCTGGGG 
TGAGAACACG 
CTACACTGGC 
CCTGTTCGAC 
GTCCCTGGTC 
CTTCCTGCTC 
CAAGATCGTG 
TTCCAGCACA 
AACCCTGACC 
CCTGGACTCC 
GGATCCACCT 
CCGTGTCCAC 
GTCCAATGGT 
GTACCAGGCA 
GACGCTGGTG 
GAATCTCACC 
CGTGCGGGAT 
GGCTGGCATT 
GGGACTACGT 
TGAAGCCCGC 
GGTGATCGAG 
GCTGCAGGCA 
GATGCTAACA 
GACCAGAAAC 
GGAGCTGAAT 
GGAGGTTTGC 
TQTGGTGTGC 
ACGCACCGGG 
GGAATCCGAC 
GGACTTCTCC 
CAGCTACAAG 
CACCATGCAG 



CGCGGAGCGG 
QCGTGGCCCG 
CGCCTCCTCC 
GGCGGCGACA 
AGTAGGCCGC 
CCCCGTACTG 
ATCAACAACC 
AGATACTTCT 
AGGCTTGGCG 
ATCCGTGAGG 
GTCTACAGCC 
GGAGACCTCA 
ACGTCAGAGA 
TGGAAGCTTC 
CAGATTCGGT 
ACTTTCACCA 
CTGTGGGCCG 
AGAA/yVCTGC 
CTGGAGGTGA 
ATGGTGGCCC 
CTGTTTTCCA 
TACAGCTGGG 
ATTCCTGAGC 
CAGAATGAGA 
ATGGACGAAG 
GCTCAGAAGG 
GAGGCTGTGA 
GTGACGGACC 
TCCAGCCCGG 
GGTCGAGACC 
ATCCTTCAGG 
GAGTCCACGG 
GTCCAGTACA 
GTGCTGGTGG 
TACGTCCAGG 
AGCTTGGAGA 
GATGTCAGGC 
GGGGACAAGC 
CAAGATATCC 
GCCTTCCGTA 
CTCAAATACT 
TGCCGCTGTG 
AAACTCACCT 
TGCGGCGTGG 
ATCACCCAGT 
CGCTCGGCGG 
GCTGTCTTCT 



GGCCCATGGT 
GCGCTCGCGA 
CCCTGCCCCG 
TGACGGACAG 
GCGCGGGGTG 
TCTGGTTGGG 
AGAAGTACAA 
TCAACTTCTA 
CCCT6TACAC 
CAGTAGAGGA 
GGCTCACGTC 
TCCTTGTGGA 
AAAACGGCTC 
GGCTCCCGGT 
CCTATGTGTA 
GGGAAAACAG 
GCACCGTCAT 
GGAGTGTCAT 
ACTGCCTCAC 
TGCAGCACTT 
ACATCATTCC 
TGATCCGCAG 
AGCTGGGCAG 
TGGTGTTCAA 
TGCAGAGTCA 
GCCCCACGGT 
AGGCCATTGC 
AGGCTGAGGC 
ATGAGGTGGC 
AGTCCTCCAT 
TCTTCCCGTT 
GGGAAATCAC 
ACGACTGGCT 
TAGCCAAGAA 
CTAAGCTGAG 
TGGAGATGGA 
CCACGCTGGA 
TGGAGACAGC 
ATGTTTTCCG 
GGAAGCATGA 
ATGAGTACGA 
CCCCAACCCA 
GTGCAGTATG 
GCGTGGAGGG 
TCAAGCATCT 
CCCTCAGTCA 
CGTCTGTGTT 



GCGGCCGTGT 
CCTCGCCCCT 
GGGCGGCGCG 
CATCCCGCTG 
CTGTGAGTGG 
ACACCCCGAG 
TTTCTTCACA 
CTTCCTGCTT 
CTACTGGGTT 
GATCCGATGT 
ACGAGGGACC 
AAAGAACCAG 
TTGCTTCTTG 
GGCCTGCACA 
CGCTGAAAAA 
TGACCCTCCG 
AGCATCAGGC 
GAATACTTCC 
CAAAATCCTG 
TGCCGGCCGC 
TATCAGCTTG 
GGATTCC7VAA 
GATTTCGTAC 
GCGGCTGCAC 
CATCTTCAGC 
CACCACCAAG 
ACTCTGCCAC 
TGAGAAGCAG 
TCTGGTCCAG 
GCAGCTGAGG 
CACCTATGAG 
GTTCTACATG 
GGAGGAGGAG 
GTCCCTCACA 
TGTGCATGAC 
GCTGCTGTGC 
GACGCTGCGC 
CACGTGCACA 
ACTGGTGACC 
CTGTGCCCTG 
GTTCATGGAA 
GAAGGCCCAG 
GGACGGAGGC 
CAAGGAAGGG 
CGGCCGCTTG 
GTTTGTGATC 
CTACTTTGCA 
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2821 
2881 
2941 
3001 
3061 
3121 
3181 
3241 
3301 
3361 
3421 
3481 
3541 
3601 
3661 



TCCGTTCCTC 
CCCGTGTTCT 
GAGCTCTACA 
GTGTTAATCA 
TCGGAGTTTG 
ATGGTGGCGC 
CTGGCCTGCT 
GCCACCCTGT 
TATGTCCTCA 
TAAGCTGCAG 
TCAAGTTCCA 
GCTTCGCTGA 
GGGACCTGAG 
GTGGGACCGG 
TGAACCTCTT 



TCTACCAAGG 
CCCTGGTTTT 
AGGACCTGCT 
GCATCTATCA 
TACACATCGT 
TCACGATCCA 
ACATTGCCTC 
CATTCCTCTG 
AGTACCTGCG 
GGCTGCCTCG 
CACGCACGAG 
GGCGACACTG 
AGCTGTACCT 
ATGGCCCGTC 
GCCTGCAGCC 



CTTCCTGATC 

GGACAAAGAC 

TAAGGGGCGG 

AGGGAGCACe 

GGCAATCTCC 

GACGTGGCAC 

CCTGGTGTTC 

GAAGGTGTCC 

GAGACGGTTC 

GGCAGGGCCT 

CCGCCTCTGC 

GGCACCTAAT 

ATCAGAACCT 

TGAGGTTTGT 

CGGGG 



ATTGGGTATT 
GTGAAGTCGG 
CCACTGTCCT 
ATCATGTACG 
TTCACATCCC 
TGGCTCATGA 
CTCCATGAGT 
GTCATCACCT 
TCCCCACCCA 
CCGGCCTCCG 
TGGACGGTGC 
GGGGATGGAA 
TGGGTGCTAA 
GGGGTCACTG 



CTACCATCTA 
AAGTCGCCAT 
ACAAGACGTT 
GGGCGCTGCT 
TCATCCTCAC 
CAGTGGCCGA 
TCATCGATGT 
TGGTCAGCTG 
GCTACTCGAA 
GCGCTNTCCC 
AGTCATGGCT 
CATTGGTGGA 
GCTGTGCTGA 
TGCAAGCTTC 



CACGATGTTT 
GTTGTATCCT 
CTTAATTTGG 
GCTGTTCGAG 
TGAGCTACTG 
GCTACTCAGC 
CTACTTCATT 
TCTCCCCCTC 
GCTCACTTCC 
CAGGAGGAGG 
GGCACATGAG 
ACCGGAGGGA 
GGGGGAAGAC 
CCTTATGGTT 



Figure 32B 
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1 MTDSIPLQPV RHKKRVDSRP RAGCCEWLRC CGGGEPRPRT VWLGHPEKRD QRYPRNVINN 
61 QKYNFFTFLP GVLFSQFRYF FNFYFLLLAC SQFVPEMRLG ALYTYWVPIiG FVIiAVTIIRE 
121 AVBBIRCYVR DKEMNSQVYS RLTSRGTVKV KSSNIQVGDL ILVEKNQRVP ADMIFLRTSE 
181 KNTGSCFLRTD QLDGETDWKL RLPVACTQRL PTAADLLQIR SYVYAEKPNI DIHNFLGTFT 
241 REMSDPPISE SLSIENTLWA GTVIASGTW GWLYTGRKL RSVMNTSDPR SKIGLFDLEV 
301 NCLTKILFGA LVWSLVMVA LQHFAGRWYL QIIRFLLLFS NIIPISLRVN IiDMGKIVYSW 
361 VIRRDSKIPG TWRSSTIPE QLGRISYLLT DKTGTLTQNB MVFKRLHLGT VAYGLDSMDE 
421 VQSHIFSIYT QQSQDPPAQK GPTVTTKVRR TMSSRVHEAV KAIALCHNVT PVYESNGVTD 
481 QAEAEKQFED SCRVYQASSP DEVAIiVQWTE SVGLTLVGRD QSSMQLRTPG DQVLNLTILQ 
541 VFPFTYESKR MGIIVRDEST GEITFYMKGA DWMAGIVQY NDWLEEECGN MAREGLRVLV 
601 VAKKSLTEEQ YQHFEARYVQ AKLSVHDRSL KVATVIESLE MBMEIiLCIiTG VEDQLQADVR 
661 PTLETLRNAG IKVWMLTGDK LETATCTAKKT AHLVTRNQDI HVFRLVTNRG EAHLELNAFR 
721 RKHDCALVIS GDSIiEVCLKY YEYEFMEIjAC QCPAWCCRC APTQKAQIVR LLQERTGKLT 
781 CAVWDGGNDV SMIQESDCGV GVEGKEGKQA SLAADFSITQ FKHLGRLLMV HGRNSYKRSA 
841 ALSQFVIHRS LCISTMQAVF SSVFYFASVP LYQGFLIIGY STIYTMFPVF SLVLDKOVKS 
901 EVAMLYPELY KDLLKGRPLS YKTFLIWVLI SIYQGSTIMY GALLLFESEF VHIVAISFTS 
961 LILTELLMVA LTIQTWHWLM TVAELLSLAC YIASLVPLHE FIDVYFIATL SFLWKVSVIT 
1021 LVSCLPLYVIi KYLRRRFSPP SYSKLTS 



1 MTDNIPLQPV RQKKRMDSRP RAGCCEWLRC CGGGEARPRT VWLGHPEKRD QRYPRNVINN 
61 QKYNFFTFLP GVLFNQFKYF FNLYPIiLLAC SQFVPEMRLG ALYTYWVPLG FVLAVTVIRE 
121 AVEEIRCYVR DKEVNSQVYS RLTARGTVKV KSSNIQVGDL IIVEKNQRVP ADMIFLRTSE 
181 KKTGSCFLRTD QLDGETDWKL RLPVACTQRL PTAADLLQIR SYVYAEEPNI DIHNFVGTFT 
241 REDSDPPISE SLSIENTLWA GTWASGTW GWLYTGREL RSVMNTSNPR SKIGLFDLEV 
301 NCLTKILFGA LVWSLVMVA LQHFAGRWYL QIIRFLLLFS NIIPISLRVN LDMGKIVYSW 
361 VIRRDSKIPG TWRSSTIPE QLGRISYLLT DKTGTLTQNE MIFKRLHLGT VAYGLDSMDE 
421 VQSHIFSIYT QQSQDPPAQK GPTLTTKVI^l TMSSRVHEAV KAIALCHNVT PVYESNGVTD 
481 QAEAEKQYED SCRVYQASSP DEVALVQWTE SVGLTLVGRD QSSMQLRTPG DQILNFTILQ 
541 IFPFTYESKR MGIIVRDEST GEITFYMKGA DWMAGIVQY NDWLEEECGN MAREGLRVLV 
601 VAKKSLAEEQ YQDFEARYVQ AKLSVHDRSL KVATVIESLE MEMELLCLTG VEDQLQADVR 
661 PTLETLRNAG IKVWMLTGDK LETATCTAKN AHLVTRNQDI HVFRLVTNRG EAHLELNAFR 
721 RKHDCALVIS GDSLEVCLKY YEYEFMELAC QCPAWCCRC APTQKAQIVR LLQERTGKLT 
781 CAVGDGGNDV SMIQESDCGV GVEGKEGKQA SLAADFSITQ FKHLGRLLMV HGRNSYKRSA 
841 ALSQFVIHRS LCISTMQAVF SSVFYFASVP LYQGFLIIGY STIYTMFPVF SLVLDKDVKS 
901 EVAMLYPELY KDLLKGRPLS YKTFLIWVLI SIYQGSTIMY GALLLFESEF VHIVAISFTS 
961 LILTELLMVA LTIQTWHWLM TVAELLSLAC YIASLVFLHE FIDVYFIATL SFLWKVSVIT 
1021 LVSCLPLYVL KYLRRRFSPP SYSKLTS 
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^J^™S'''^'^™^°^'^TTAa^CCaiGCAATCCCAGGAc5?A?^ 
GGCCCAACGCTCACCACTAAGQTCCGGCGGACaVTGAGCy^GCCGCGTGC^^^ 

p^^o^^3^'^''™'=^'^^^t°^c«^cgtggaggaccSctgS^ 
^^^«^^S^°''^'=^^^^*^'^ctggc^tcaaggtttggatgctgaS^^ 
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1 GGGAAGCTGT TGCGCACCAC TTAGCTGGGA AGTGCGTTGC TCCCTGTTrr rn7,nnnn:^r.r, 

■nil 

■Hill 

B = iEi i™ ~i 1^ 

-~ lEII ---^^ sss 
1^,1 ~5 iii sss? 

^i^i =s? sss?^ 

ilii sss ^?sss 
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MAPKLSDSVE ELRAAGMQSP RNGQYAEASA LYERALRLLQ ARGSADPEEB SVTVQWoanr. 
YI.KDGNCTDC IKDCTSALAL VPFSIKPLLR RASAYBaSS "^^JS ^SS^S 
^^VB^^nJ^ AI,MDSLGPEW RLKLPPIPW PVSAQKRWNS £^^^^ 
I^RVPSAGDV ERAKALKEEG NDLVKKONHK KAIEKYSESL LCSSLESATY SnS^W 
S^SS'^ ^ALKLDGKN VKAPVRRAQA VKALKDYKSS LsSSSS 



Figure 40 



Figure 41 

i GGCACGAGGC ACCACACGGG GGAGGAAGGA AGGAGCTCCC AACTCGrrrn nnTnr^r-r.7.r.r, 
61 GGATGGCCCC CAAATTCCCA GACTCTGTGG AGGAGCTCCG SSSScS? S^n^SS 
121 TCCGCAACGG CCAGTACGCC GAGGCCTCCG CGCTCtIcS SSS^gSS JJSJSS^J 

181 aggcgcaagg ttcttcagac ccagaagaag aaagtgttct ctactc?S? c2?i??S^S 

241 GTCACTTGAA GGATGGAAAC TGCAGA6ACT GCAtSaaS 5tgSot?J? S^J^S^n^^^ 
^^S^^^"^^^^ CAGCATTAAG CCCCTGCTGC SS^SStc SSJSgS gSSgSS 
It^ nfJ™^^'' GGCCTATGTO QACTATAAGA CTGTGCTGCA GATTCATGA? SSacS 
421 CAGCCGTAGA AGGCATCAAC AGAATGACCA GAGCTCTCAT GGACTCGCTT GGrrrSSar^ 
l!i nr^^n^ C3CTGCCCTCA ATCCCCTTGG TGCCTGTT?? AgSgSg 
541 CCTTQCCTTC GGAGAACCAC AAAGAGATG6 CTAAAAGCAA ATCCAAA^ arnl^T^II 
601 CAAA6AACAG AGTGCCTTCT GCTGGGGATG TGGAGa£agC ^SSSfl? 
^^12^^^^ TGTAAAGAAG GGAAACCATA AGAAaScTa5 SaSSSSS ^SS^^ 
TAACCTGGAA TCTGCCACGT ACAGCAACAG AGCActHtGC ^AirTO^c? 
781 TGAAGCAGTA CACAGAAGCA GTGAAGGACT GCACAGAAGC CCTCAAGCTG GMrritln^ 

841 acgtgaaggc attctacaga cgggctcaag cccic^c aSS^S S??S?^rJ 
901 gctttgcaga catcagcaac ctcctacaga ttgagcctag gaatgStc?? g?™JaS 
lol^ r?irt^^ agtoaagcag aacctacact aaaaaccc^ 

inJ: cccagagaag ccatgggcca cctgctctgt gcccgctcct gSacSSg? 

tgagctctga agccccctcc tcaatccctt gatggcctcc cacccSSS 
llol ^^^^ taaactcagt gtagtcaaac aJJgaSSSS S?SS^?? 

nll^ CCACTAGAGC TAAGCGTGAA GCTGAAGCTC TGTCCCTATT CCCCCA^rr 

^It} ^°CTAGCTGA TCACACCAAC AGATCCTCAT CAGCAAA6CA TTTGgStS 

1321 GTGGGCTGCA GACTGAGTGC TGCCCTTGTA GCTTCCCCAG ACCCcSrrr ar^n^S^ 

1381 ATCTGAACAA CCTGAGCTCC TGGGCCGGGG TgSaSS^ ^S^J? 

1441 ATCCAAAGCA GCCTGTTGAG CTGGTTCTCC AGGGCTGCAG TCTciccJcS ^S^A^JS^S 

1501 CTGTCCCTGC CCTGTCCTGT CCTTGCACAG TCTCCTATGT CTGAGCCcS OTgSSJpJ? 

litn S^^n ^^^T^TGG GAAGQCAGAG CCCTGAcS? SS?§g5S J^SSSJ? 

ccttctgcag agaggcacct aagctgttta aagagcccag tgattgtgJ? 

lll^ ^^n^^^ AGAGGTGGGA GGGGGCAAGA GGCCTCCTTG GTCAGTg5?? StcSJtcJS 
^Itl QGCAGGGACT TGGTTTTTTG TTCCAACAGT GGCCTTCTCC GGGCTTCATA GTTCTTTO™ 
nil wl^^^^ GTTAATTTGA ATTGACTGAT TTTGTTGAAC TGTGTG™ S^SS^S 
1861 TTAAAAAGCT TTCTTCTACA TGAATATCTG CTGTGCTTTC ATTTATQCCT TTTCAG^ 
nil ^ZSS^ CTCTGTAGTA ATAATAAAAG TTATTGCTTA JSSSJSc SJSSJS 
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MAPKVSDSVE QLRAAGNQNP. RNGQYGEASA 
YLKDGNCTDC IKDCTSIU^AL VPFSIKPLIiR 
ALBGINRITR ALMDSLGPEW RLKLPPIPW 
KSRVPSA6DV ERAKALKEEG NDLVKKGNHK 
KQYKEAVKDC TEALiOiDGKN VKAFYRRAQA 
RQEVNQNMN 



LYERALRLLQ ARGSADPEEE SVLYSNRAAC. 
RASAYEALEK YALAYVDYKT VIiQIDNSVAS 
PVSAQKRWNS LPSDNHKETA KTKSKEATAT 
KAIEKYSESL LCSSLESATY SNRAIiCHIiVL 
YKALKDYKSS LSDISSLLQI EPRNGPAQKL 



Figure 43 



MAPKFPDSVE ELRAAGNESP RNGQYAEASA 
HWKNGNCRDC IKDCTSAIiAL VPFSIKPIiLR 
AVEGINRMTR ALMDSLGPEW RLKLPSFPLV 
KNRVPSAGDV EKARVLKEEG NELVKKGNHK 
KQYTEAVKDC TEALKLDGKN VKAFYRRAQA 
RQEVKQNLH 



LYGRALRVLQ AQGSSDPEEE SVLYSNRAAC 
RASAYEALEK YPMAYVDYKT VLQIDDNVTS 
PVSAQKRWNF LPSENHKEMA KSKSKETTAT 
KAIEKYSESL LCSNLESATY SNRALCYLVL 
HKALKDYKSS FADISNLLQI EPRNGPAQKL 



Figure 44 
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1 GACTGGCTGG TGCGGGAAAT ATGCAGGAGA AAAGTCTTTG CATAATGTAG AGCGAGCCGT 
61 GGGGCTCCGG GAGCGGCGCC CCAAGGTCTG GGGCCATGAA CGCGAGCGTG GAAGGAGACA 
121 CCTTTTCTGG ATCGATGCAA ATCCCAGGAG GCACCACGGT CGTGGTGGAG CTGGCACCGG 
181 ACATCCACAT CTGCGGCCTC TGTAAGCAGC ACTTCAGCAA TCTGGATGCC TTTGTGGCCC 
241 ACAAACAGAG CGGCTGCCAG CTGACTACCA CGCCGGTGAC AGCCCCCAGC ACGGTCCAGT 
3 01 TT6TGGCAGA GGAGACAGAG CCTGCCACCC AGACCACCAC AACGACCATC AGTTCAGAGA 
361 CTCAGACTAT CACAGTTTCA GCTCCAGAGT TCGTCTTTGA ACATGGCTAC CAAACTTACC 
421 TGCCCACGGA GAGCACTGAC AACCAGACAG CCACCGTGAT CTCTCTCCCC ACCAAGTCAC 
481 GCACCAAAAA GCCCACAGCA CCCCCTGCTC AGAAGAGACT CGGCTGCTGC TATCCAGGTT 
541 GCCAGTTCAA GACCGCCTAT GGCATGAAGG ACATGGAGCG ACACCTGAAG ATCCACACCG 
601 GTGACAAACC CCACAAGTGT GAGGTGTGCG GGAAGTGCTT CAGCCGGAAG GACAAGCTGA 
661 AGACGCACAT GCGCTGCCAC ACGGGCGTCA AGCCCTACAA GTGCAAGACG TGCGACTACG 
721 CGGCGGCGGA CAGCAGCAGC CTTAACAAGC ACCTGCGCAT CCACTCGGAC GAGCGACCCT 
. 781 TCAAGTGCCA GATCTGTCCC TACGCCAGCC GCAACTCCAG CCAGCTCACC GTGCACCTGC 
841 GCTCGCACAC GGGGGACGCC CCCTTCCAGT GCTGGCTCTG TAGTGCCAAG TTCAAAATCTA 
901 GCTCGGACTT GAAAAGGCAC ATGCGTGTGC ACTCGGGGGA GAAGCCTTTC AAGTGCGAAT 
961 TCTGCAATGT CCGCTGTACC ATGAAGGGQA ACCTCAAATC GCACATCCGC ATCAAQCACA 
1021 GTGGGAATAA CTTCAAGTGT CCGCACTGCG ACTTCCTGGG TQACAGCAAA TCCACCCTGC 
1081 GGAAGCACAG TCGCCTGCAC CAGTCGGAGC ACCCGGAGAA GTGTCCCGAG TGCAGCTACT 
1141 CCTGTTCCAG CAAGGCCGCG CTGCGCGTGC ACGAGCGCAT CCACTGCACC GAGCGCCCGT 
1201 TCAAGTGCAG CTACTGCAGC TTCGATACCA AGCAACCCAG CAACCTGAGC AAGCACATGA 
1261 AGAAGTTCCA CGCCGACATG CTCAAGAACG AGGCTCCGGA GAAGAAGGAG AGCGGCAGGC 
1321 AGAGCAGCCG GCAGGTGGCC AGGCTGGATG CCAAGAAGAC GTTCCACTGC GACATCTGTG 
1381 ACGCCTCGTT TATGCGGGAG GACTCGCTCC GCAGCCACAA ACGGCAGCAC AGTGAGTACC 
1441 ACAGTAAGAA CTCGGACGTG ACTGTAGTAC AGCTTCACCT TGAACCCAGC AAGCAGCCGC 
1501 TGCGCCCCTC ACCGTAGAGC AAATCCAGGT CCCCCTCCAG TCCAGCCAGG TGCCCCAGTT 
1561 CAGCGAGGGO AGGGTCAAGA TCATCGTGGG GCATTACAGG TGCGTCAGAC GAACCGCCAT 
1621 AGTCCAAGCG GCCGCAGCTG CCGTCAACAT TGTGCCCCCC ACCCTGGTAG CCCAGACCCC 
1681 AGAGGAGATC CCAGGGAACG GCCGGCTACA GATCCTTCGC CAGGTCAGTC TCATTGCCCC 
1741 TCCTCAGTCC TCCGGGTGTC CCGGCGAAGC AGGTGCCCTG AGTCAGCCAA CTGTCCTGCT 
1801 GACCACCCAT GATCAGACGG CAGGGGCCGC CCTGCAGCAG GCTCTGATCC CCACCACCCC 
1861 GGTTGGGACC CAGGAAGGCA CGGGAAACCA GACATTCATT GCCAGTTCGG GCATCGTGCT 
1921 CGGACTTGGA AGGCCTTAAG CTCTATTCAG GAGGGAACGA CGGAAGTGAC TGTGGTGAGC 
1981 GATGGGGACC AGAGCATCGC AGTGGCCACC ACGGCACCCT CTATCTTCTC TACCCAGCAG 
2041 GAACTGCCCA AGCAGACTTA CTCCATCATC CACGGGGCGG CACACCCCGC CCTGCTCTGT 
2101 CCCGCCGACT CCATTCCTGA TTAGTCTGGA GGGAGGGGTG ACAGACAAGA CAAACTGCGA 
2161 GAGGAGTACT GTGAGAGGCT CCTGGTCCCG CATAAATAAT TGTATTTTAT. ACAGTTTATG 
2221 TAATTTTTTA ACAGGGTATC AAGCTGGAGA CCATTCTCCC TCAAGCTCTT GTTGATTGTG 
2281 TCTTAATGQT TACCAAGGCT QATTCCAATG TG6AGTTGGA ATTCACCACA GTAGGACTGA 
2341 ATACATTCGT TTGTTTTTCC ATGTTTAGGA TTTAATTTTT TTCAACOXSGA ATAAAGGAGT 
2401 TTGGQATTTG GGTTAAAAAA 



Figure 45 
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MNASVEGDTF SGSMQIPGGT TVWELAPDI 
VTAPSTVQFV AEETEPATQT TTTTISSETQ 
VISLPTKSRT KKPTAPPAQK RLGCCYPGCQ 
CFSRKDKLKT HMRCHTGVKP YKCKTCDYAA 
SSQLTVHLRS HTGDAPFQCW LCSAKPKISS 
KSHIRIKHSG NNFKCPHCDF LGDSKSTLRK 
RIHCTERPFK CSYCSPDTKQ PSNLSKHMKK 
KTPHCDICDA SFMREDSLRS HKRQHSEYHS 



HICGLCKQHF SNLDAFVAHK QSGCQLTTTP 
TITVSAPEFV FEHGYQTYLP TESTDNQTAT 
FKTAYGMKDM ERHLKIHTGD KPHKCBVCGK 
ADSSSLNKHL RIHSDERPPK CQICPYASRN 
DliKRHMRVHS GEKPPKCEPC ITVRCTMKGNL 
HSRLHQSEHP EKCPECSYSC SSKAALRVHE 
PHADMLKNEA PEKKESGRQS SRQVARLDAK 
KWSDVTWQL HLEPSKQPLR PSP 



MNASVEGDTF 
VTAPSTVQFV 
VISLPTKSRT 
CFSRKDKLKT 
SSQLTVHLRS 
KSHIRIKHSG 
RIHCTERPFK 
KTPHCDICDA 



SGSMQIPGGT 
AEETEPATQT 
KKPTAPPAQK 
HMRCHTGVKP 
HTGDAPFQCW 
NNFKCPHCDF 
CSYCSFDTKQ 
SFMREDSLRS 



TVWELAPDI 
TTTTISSETQ 
RLGCCYPGCQ 
YKCKTCDYAA 
LCSAKFKISS 
LGDSKSTLRK 
PSNLSKHMKK 
HKRQHSEYHS 



MNASVEGDTF 
VTAPSTVQFV 
VISLPTKSRT 
CFSRKDKLKT 
SSQLTVHLRS 
YPGCHFKTVH 
CDYAAVDSSS 
PKISSDLKRH 
LEHSRLHQAD 
DKVHREGAKT 
SDWGENKNSN 



SGSMQIPGGT 
AEETEPATQT 
KKPTAPPAQK 
HMRCHTGVKP 
HTASVLENDV 
GMKDLDRHLR 
LKKHLRIHSD 
MIVHSGEKPF 
HPBKCPECSY 
ENRAPPGKDG 
LVTPPSEGIA 



TVLVELAPDI 
TTTTISSETQ 
RLGCCYPGCQ 
YKCKTCDYAA 
QKPAGLPAEE 
IHTGDKPHKC 
ERPYKCQLCP 
KCEPCDVRCT 
SCSNPAALRV 
PGESGPHHVP 
TGQLGPIjVSV 



Figure 47 

HICGLCKQHF 
TITVSAPEFV 
FKTAYGMKDM 
ADSSSLNKHL 
DLKRHMRVHS 
HSRLHQSEHP 
FHADMLKNEA 
KNSDVTWQL 

Figure 48 

HICGLCKQHF 
TITVSAPEFV 
FKTAYGMKDM 
ADSSSLNKHL 
SDAQQAPAVT 
EPCDKCPSRK 
YASRNSSQLT 
MKANLKSHIR 
HSRVHCTDRP 
NVSTQRAFGC 
GQLESTLEPS 



SNLDAFVAHK 
FEHGYQTYLP 
ERHLKIHTGD 
RIHSDERPFK 
GEKPPKCEPC 
EKCPECSYSC 
PEKKESGRQS 
HLEPSKQPLR 



QSGCQLTTTP 
TESTDNQTAT 
KPHKCEVCGK 
CQICPYASRN 
NVRCTMKGNL 
SSKAALRVHE 
SRQVARLDAK 
PSP 



SNLDAFVAHK 
FEHGYQTYLP 
ERHLKIHTGD 
RIHSDERPPK 
LSLEAKERTA 
DNLTMHMRCH 
VHLRSHTGDT 
IKHTFKCLHC 
FKCDFCSFDT 
DKCGASPVRD 
HDL 



QSGCQLTTTP 
TESTDNQTAT 
KPHKCEVCGK 
CQICPYASRN 
TLGERTFNCR 
TSVKPHKCHL 
PFQCWLCSAK 
AFQGRDRADL 
KRPSSLAKHI 
DSLRCHRKQH 



MNASVEGDTF SGSMQIPGGT 
VTAPSTVQFV AEETEPATQT 
VISLPTKSRT KKPTAPPAQK 
CFSRKDKLKT HMRCHTGVKP 
SSQLTVHLRS HTAWRCDCLG 



Figure 49 

TVLVELAPDI HICGLCKQHF 
TTTTISSETQ TITVSAPEFV 
RLGCCYPGCQ FKTAYGMKDM 
YKCKTCDYAA ADSSSLNKHL 
STKPWVPSLV TT 



Figure 50 



SNLDAFVAHK QSGCQLTTTP 
FEHGYQTYLP TESTDNQTAT 
ERHLKIHTGD KPHKCEVCGK 
RIHSDERPFK CQICPYASRN 
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IWASS^QESF AGSVQXPQGT TVLVELTPDJ 
AAAPSTVQFV SEETVPATQT QTTTRTITSE 
ATVISLPAKS RTKKPTTPPA QKRLNCCyPG 
GKCFSRKDKL KTHMRCHTGV KPYKCKTCDY 
RNSSQLTVHIi RSHTGDAPFQ CWLCSAKFKI 
NLKSHIRIKH SGNNFKCPHC DFLGDSKATIi 
HERIHCTDRP FKCNYCSPDT KQPSNLSKHM 
AKKSFHCDXC DASFMREDSL RSHKRQHSfiY 
QVPLQPSQVP QPSEGRVKII VGHQVPQANT 
QILRQVSLIA PPQSSRCPSE AGAMTQPAVL 
QTPITSSGIT CTDPEGIiNAL IQEGTABVTV 
YSIIQGAAHP ALLCPADSIP D 



HICGICKQQF WNLDAPVAHK QSGCQIiTGTS 
TQTITVSAPE FVFEHGYQTY LPTESNENQT 
CQFKTAYGMK DMERHLKIHT GDKPHKCEVC 
AAADSSSLNK HLRIHSDERP PKCQICPYAS 
SSDLKRHMRV HSGBKPFKCE PCNVRCTMKG 
RKHSRVHQSE HPBKCSECSY SCSSKAALRI 
KKPHGDMVKT BALERKDTGR QSSRQVAKLD 
NESKBrSDVTV LQFQIDPSKQ PATPLTVGHL 
IVQAAAAAVN IVPPALVAQN PEBliPGNSRIi 
LTTHEQTDGA TIiHQTIiIPTA SGGPQEGSGN 
VSDGGQNIAV ATTAPPVPSS SSQQELPKQT 



Figure 51 



MNASSEGESF AGSVQIPGGT TVLVELTPDI 
AAAPSTVQFV SEETVPATQT QTTTRTITSE 
HKCEVCGKCF SRKDKLKTHM RGHTGVKPYK 
ICPYASRNSS QIiTVHIjRSHT GDAPFQCWIiC 
RCTMKGNLKS HIRIKHSGNN PKCPHCDPLG 
KAALRIHERI HCTDRPFKCN YCSFDTKQPS 
QVAKLDAKKS FHCDICDASF MREDSLRSHK 
LTVGHLQVPL QPSQVPQFSE GRVKIIVGHQ 
PGNSRLQILR QVSLIAPPQS SRCPSEAGAM 
QEGSGNQTPI TSSGITCTDP EGLKTALIQEG 
ELPKQTYSII QGAAHPAIiLC PADSIPD 



HICGICKQQF NNIjDAFVAHK QSGCQLTGTS 
TQTITGCQPK TAYGMKDMER HLKIHTGDKP 
CKTCDYAAAD SSSLNKHLRI HSDBRPPKCQ 
SAKFKISSDL KRHMRVHSGB KPFKCEFCa^V 
DSKATLRKHS RVHQSEHPEK CSECSYSCSS 
NliSKHMKKFH GDMVKTEALE RKDTGRQSSR 
RQHSEYSESK NSDVTVLQFQ IDPSKQPATP 
VPQANTIVQA AAAAVNIVPP ALVAQKTPEEL 
TQPAVLLTTH EQTDGATLHQ TIiIPTASGGP 
TAEVTWSDG GQNIAVATTA PPVFSSSSQQ 



Figure 52 



MNASSEGESF AGSVQIPGGT TVIiVBIiTPDI 
AAAPSTVQFV SEBTVPATQT QTTTRTITSE 
ATVISLPAKS RTKKPTTPPA QKRI*NCCYPG 
GKCFSRKDKXi KTHMRCHTGV KPYKCKTCDY 
RNSSQLTVHL RSHTGDAPFQ CWLCSAKFKI 
NLKSHIRIKH SGNNFKCPHC DPLGDSKATL 
HERIHCTDRP FKCNYCSPDT KQPSNLSKHM 
AKKSFHCDIC DASFMREDSL RSHKRQHSEY 
aVPLQPSQVP QFSEGRVKII VGHQVPQANT 
QILRQVSLIA PPQSSRCPSE AGAMTQPAVL 
QTPITSSGIT CTDFEGLNAL IQBGTAEVTV 
YSIIQGAAHP ALLCPADSIP D 



HICGICKQQF NNLDAFVAHK QSGCQLTGTS 
TQTITVSAPB FVFEHGYQTY LPTESNENQT 
CQFKTAYGMK DMERHLKIHT GDKPHKCEVC 
AAADSSSLNK HLRIHSDERP FKCQICPYAS 
SSDLKRHMRV HSGEKPPKCE FCNVRCTMKG 
RKHSRVHQSE HPEKCSECSY SCSSKAALRl 
KKFHGDMVKT EALERKDTGR QSSRQVAKLD 
SESmSDVTV LQFQIDPSKQ PATPLTVGHL 
IVQAAAAAVN IVPPALVAQN PEELPGNSRL 
LTTHEQTDGA TLHQTLXPTA SGGPQISGSGN 
VSDGGQNIAV ATTAPPVFSS SSQQELPKQT 



Figure 53

Ortholog identification module
eQTL identification module 
cQTL identification module 
Determination module
Classification module 
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FIG. 54 

WO 2004/061616 PCT/US2003/041613








6902 



6906 



Identify human population, collect family-based sample with disease-related
phenotypes, relevant tissue(s) for expression profiling, and genome-wide
genotyping data
P E G




6904 



Identify inbred strains discordant for phenotype of interest; construct,
phenotype, and genotype genetic cross; collect relevant tissue(s) for
expression profiling
P E G



Profile individuals to identify a
disease-associated pattern
P + E => DAP



6910 



Evaluate underlying genetics of 
genes involved in pattern 
DAP + G 



6914 



6912 



Intersect genetics of pattern with
genetics of disease-related traits to
identify key drivers
P + G ∩ DAP + G => D



Validate D through association study in an
independent sample



6916 



Validate D through advanced crosses, 
congenic strains or similar model systems 



Use synteny to inform selection of
targets

Identify subset of druggable targets



6920 
6922 



Validate targets via knock-out
models and/or RNAi techniques



Fig. 69

LOD Score Plots with Normal Gene X Expression




LOD score curve for gene X 

LOD score curves for genes Y1,
Y2, Y3, and Y4 attributed to QTL
(given by gene X)

Physical location of gene X 



LOD Score Plots with Gene X RNAi Knockout




LOD score curves for genes X, Y1,
Y2, Y3, and Y4 after siRNA knockdown
of gene X



Fig. 70

7102

Select a trait; optionally, expose a portion of a plurality of organisms to a perturbation that affects the trait

7104

Measure gene expression / cellular constituent level data 50 in the secondary tissue of a plurality of organisms 46

7106

Transform gene expression / cellular constituent level data 50 into expression statistics

7150

Measure one or more phenotypes for all or a portion of the organisms 46 in the plurality of organisms

7152

Classify the plurality of organisms into distinct phenotypic groups based on the phenotypes exhibited by the organisms

7154

Identify the phenotypic extremes for the subpopulation with respect to the trait under study or a phenotype related to the trait under study

7156

Filter the cellular constituent data to identify which cellular constituents discriminate the organisms into the phenotypic extremes identified in step 7154 (e.g., application of a t-test)

7158

Optionally, reduce the number of cellular constituents from step 7156 using a reducing algorithm (e.g., stepwise regression, principal component analysis, a stochastic search, etc.)

7160

Optionally, cluster (e.g., k-means clustering) cellular constituents from step 7158 (or step 7156) to identify further subgroups within each phenotypic subpopulation

7164, Fig. 71B

FIG. 71A
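
Steps 7154 and 7156 can be illustrated with a minimal Python sketch. It is not the method of the figure itself: the function names, the 25% extreme fraction, the |t| cutoff of 4.0, and the toy data are all assumptions chosen for illustration, and Welch's unequal-variance form of the t-statistic is used as one possible reading of "application of a t-test".

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0.0:
        # Degenerate case: no spread in either group.
        return 0.0 if mean(a) == mean(b) else math.inf
    return (mean(a) - mean(b)) / se


def phenotypic_extremes(phenotype, fraction=0.25):
    """Return (low_ids, high_ids): the bottom and top `fraction` of organisms by trait value."""
    ranked = sorted(phenotype, key=phenotype.get)
    k = max(2, int(len(ranked) * fraction))  # at least 2 per group so a variance exists
    return ranked[:k], ranked[-k:]


def discriminating_constituents(expression, low_ids, high_ids, t_cutoff=4.0):
    """Keep constituents whose |t| between the two extremes reaches the (assumed) cutoff."""
    kept = []
    for name, levels in expression.items():
        t = welch_t([levels[o] for o in low_ids], [levels[o] for o in high_ids])
        if abs(t) >= t_cutoff:
            kept.append(name)
    return kept


# Hypothetical toy data: trait values and expression statistics per organism.
phenotype = {"o1": 1.0, "o2": 1.2, "o3": 1.1, "o4": 5.0,
             "o5": 5.3, "o6": 4.9, "o7": 3.0, "o8": 3.1}
expression = {
    "geneA": {"o1": 0.1, "o2": 0.3, "o3": 0.2, "o4": 2.0,
              "o5": 2.1, "o6": 1.9, "o7": 1.0, "o8": 1.1},
    "geneB": {"o1": 1.0, "o2": 1.2, "o3": 1.4, "o4": 1.1,
              "o5": 1.3, "o6": 1.2, "o7": 1.2, "o8": 1.1},
}

low_ids, high_ids = phenotypic_extremes(phenotype)
kept = discriminating_constituents(expression, low_ids, high_ids)
```

In this toy data set only geneA separates the two extremes. A real analysis would use a p-value with a multiple-testing correction rather than a fixed cutoff on the statistic.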
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7160, Fig. 71 A 

C 



7164

Use the set of cellular constituents identified as discriminators between phenotypic extremes to build a classifier

7166

Use the classifier to classify all or a substantial portion of the organisms in the population under study, thereby further refining the definition of the trait under study

7168

Perform quantitative genetic analysis on each subgroup of organisms defined by the classifier developed in step 7166

FIG. 71B
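
The figure does not say which kind of classifier steps 7164 and 7166 use. As one simple possibility, the sketch below builds a nearest-centroid classifier from the discriminating constituents and applies it to organisms outside the phenotypic extremes. The function names, group labels, and toy data are hypothetical.

```python
from statistics import mean


def build_centroids(expression, groups, constituents):
    """Map each group label to the mean expression vector of its member organisms."""
    return {
        label: [mean([expression[c][o] for o in ids]) for c in constituents]
        for label, ids in groups.items()
    }


def classify(expression, organism, centroids, constituents):
    """Assign an organism to the group whose centroid is closest (squared Euclidean distance)."""
    profile = [expression[c][organism] for c in constituents]

    def sqdist(centroid):
        return sum((p - q) ** 2 for p, q in zip(profile, centroid))

    return min(centroids, key=lambda label: sqdist(centroids[label]))


# Hypothetical toy data: two discriminating constituents, six organisms.
expression = {
    "geneA": {"o1": 0.1, "o2": 0.2, "o3": 0.4, "o4": 1.7, "o5": 2.0, "o6": 2.1},
    "geneC": {"o1": 3.0, "o2": 3.2, "o3": 2.9, "o4": 1.2, "o5": 1.0, "o6": 0.9},
}
constituents = ["geneA", "geneC"]
extremes = {"low": ["o1", "o2"], "high": ["o5", "o6"]}

centroids = build_centroids(expression, extremes, constituents)
labels = {o: classify(expression, o, centroids, constituents)
          for o in ["o1", "o2", "o3", "o4", "o5", "o6"]}
```

Each resulting subgroup in `labels` could then be passed separately to the quantitative genetic analysis of step 7168.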
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Phenotype 
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Plienotype 
IVI 


CC 
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0 • • 
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Organism 46-1 
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Level 
50-N-1 


• 99 


Level 
50-N-Z 
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Fig. 74 
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Fig. 76 



[Diagram: Test for Causality; nodes Q, HSD1, OFPM]

Fig. 77
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Fig. 78 




7902

Genotype a population under study and, optionally, use pedigree information for the population

7904

Phenotype the population with respect to a trait or traits of interest and map quantitative trait loci (cQTL) for each phenotype, resulting in a set of cQTL linked to the trait

7906

Obtain abundance data for a plurality of cellular constituents from one or more tissues in each member of the population under study

7908

Identify cellular constituents (association set D) whose abundance levels across the population significantly associate with the trait of interest (e.g., by use of Pearson correlations, basic discriminant analysis, regression models, etc.)

7910

For each cellular constituent i in association set D, perform quantitative genetic analysis, in which abundance levels of cellular constituent i across the population serve as a quantitative trait, in order to identify eQTL for cellular constituent i

7912

Remove all cellular constituents from association set D that do not have at least one eQTL that is coincident with (within a support interval of) a cQTL for the trait of interest in order to form the candidate causative cellular constituent set. Optionally, require that all coincident eQTL/cQTL pass a pleiotropy test in order to be considered coincident. Cellular constituents removed from association set D form a candidate reactive cellular constituent set.

7914

FIG. 79A
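
Steps 7908 and 7912 can be sketched as follows, assuming that QTL mapping (steps 7904 and 7910) has already produced peak positions and support intervals. The correlation cutoff of 0.8, the tuple layouts for eQTL and cQTL, and all data values are assumptions for illustration. The optional pleiotropy test is not implemented.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def association_set(trait, abundance, r_cutoff=0.8):
    """Step 7908 (sketch): constituents whose abundance correlates strongly with the trait."""
    return [c for c, levels in abundance.items()
            if abs(pearson(levels, trait)) >= r_cutoff]


def is_coincident(eqtl, cqtl):
    """True if the eQTL peak lies on the same chromosome inside the cQTL support interval."""
    chrom, position = eqtl
    c_chrom, start, end = cqtl
    return chrom == c_chrom and start <= position <= end


def split_causative_reactive(assoc, eqtls, cqtls):
    """Step 7912 (sketch): split association set D by eQTL/cQTL coincidence."""
    causative, reactive = [], []
    for c in assoc:
        hit = any(is_coincident(e, q) for e in eqtls.get(c, []) for q in cqtls)
        (causative if hit else reactive).append(c)
    return causative, reactive


# Hypothetical toy data; positions are in arbitrary map units.
trait = [1.0, 2.0, 3.0, 4.0, 5.0]
abundance = {
    "geneA": [2.0, 4.0, 6.0, 8.0, 10.5],
    "geneB": [5.0, 1.0, 4.0, 2.0, 3.0],
    "geneC": [10.0, 8.0, 6.5, 4.0, 2.0],
}
eqtls = {"geneA": [("13", 42.0)], "geneC": [("2", 10.0)]}  # (chromosome, peak position)
cqtls = [("13", 35.0, 50.0)]                               # (chromosome, start, end)

assoc = association_set(trait, abundance)
causative, reactive = split_causative_reactive(assoc, eqtls, cqtls)
```

Here geneB fails the association cutoff, geneA has an eQTL inside the cQTL support interval and is kept as a candidate causative constituent, and geneC falls into the candidate reactive set.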




7916

For each cellular constituent i in the candidate causative cellular constituent set, determine the amount of genetic variation in the trait of interest that is explained by the eQTL of cellular constituent i coincident with the cQTL from the trait of interest. Rank order the cellular constituents in the candidate causative cellular constituent set based upon the amount of genetic variation in the trait of interest that is explained by each cellular constituent determined in this manner.

7918

For each eQTL of each cellular constituent i in the candidate causative cellular constituent set, test for the relationship:

P(T,Q|G) = P(T|G)P(Q|G)

where,

T is variance in the trait of interest,

Q is variance in the genome at the position where the eQTL (associated with cellular constituent i) overlaps with a cQTL linked to the trait of interest, and

G is variance in the abundance level of cellular constituent i

7920

Optionally, determine whether each cellular constituent i in the candidate causative cellular constituent set includes a druggable domain

7922

Optionally, rank cellular constituents in the candidate causative cellular constituent set based on the rank assigned in step 7916 and the results of step 7918 and/or step 7920

7924

Optionally, validate top ranking cellular constituents using gene knock outs/ins, transgenic construction, siRNA, drug treatments targeting candidate genes, time series experiments, etc.

FIG. 79B
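
The relationship in step 7918 says that T and Q are conditionally independent given G. The figure does not specify how the test is carried out. The sketch below swaps in a simple stand-in, assuming linear, roughly Gaussian relationships and per-organism values for T, Q, and G: under those assumptions, conditional independence corresponds to a partial correlation of zero between T and Q after controlling for G. The simulated data and all names are hypothetical.

```python
import math
from itertools import product


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical causal chain Q -> G -> T, built as a full factorial design so the
# two noise terms are exactly uncorrelated with the genotype and with each other.
rows = list(product([0, 1, 2], [-1, 1], [-1, 1]))  # (genotype, noise1, noise2)
Q = [q for q, _, _ in rows]                        # genotype at the shared locus
G = [q + 0.5 * e1 for q, e1, _ in rows]            # abundance driven by genotype
T = [2 * g + 0.5 * e2 for g, (_, _, e2) in zip(G, rows)]  # trait driven by abundance

marginal = pearson(T, Q)                    # T and Q are strongly correlated overall
causal_residual = partial_corr(T, Q, G)     # close to 0: G accounts for the T-Q link
reactive_residual = partial_corr(G, Q, T)   # clearly non-zero: T does not account for the G-Q link
```

A partial correlation near zero for (T, Q | G), together with a non-zero value for (G, Q | T), is the pattern expected when the constituent lies between the locus and the trait. On real data a significance test, such as a Fisher z-transform of the partial correlation, would be needed to decide whether the residual correlation is distinguishable from zero.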
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Phenotypic statistic set for clinical trait 1 
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Abundance / genotype warehouse 
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8104-G-^ 

Abundance statistic for g ene G from organism 
Abundance statistic for gene G from organism 
Abundance statistic for gene G from organism 
Abundance statistic for gene G from organism 



Abundance statistic for gene G from organism 
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Abundance / genotype warehouse 
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Abundance statistic set 1 
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Abundance statistic 1 / Tissue b 
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Lod Score Curve for INS Chromosome 13 QTL 
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Lod Score Curve for EPIPA Chromosome 13 QTL 
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Lod Score Curve for LEP Chromosome 13 QTL 
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Lod Score Curve for CHDL Chromosome 13 QTL 
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1 mdvvdsllvn gsnitppcel glenetlfcl 

61 Ivitvlirnk rmrtvtnifl Islavsdlml 

121 mgtsvsvstf nlvaislery gaickplqsr 

181 nlvpftknnn qtanmcrfll pndvitiqqswh 

241 kfeasqkksa kerkpsttss gkyedsdgcy 

301 saanlmakkr virmlivivv Ifflcwmpif 

361 tsscvnpiiy cfmnkrfrlg fmatfpccpn 

421 msasvppq (SEQ ID NO: 30) 



dqprpskewq pavqillysl ifllsvlgnt 
clfcmpfnli pnllkdfifg savcktttyf 
vwqtkshalk viaatwclsf timtpypiys 
tflllilfli pgivmmvayg lislelyqgi 
Iqktrpprkl elrqlstgss sranrirsns 
sanawraydt asaerrlsgt pisfilllsy 
Pgppgargev geeeeggttg aslsrfsysh 



Fig. 88 
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